Project Objective:

To build a recommendation system for mobile phones using

  1. a popularity-based method, which recommends the most popular phones, and
  2. a collaborative filtering method, which recommends phones personalised to each user.

Importing required libraries.

Merging the provided CSVs into one data-frame
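A minimal sketch of the merge step, assuming every CSV shares the same columns. The file names are not given here, so two hypothetical in-memory CSV strings stand in for the provided files, and the column names are illustrative only.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the provided CSV files (same columns in each).
csv_part_1 = "author,product,score,score_max\nalice,Phone A,8,10\nbob,Phone B,6,10\n"
csv_part_2 = "author,product,score,score_max\ncarol,Phone A,9,10\n"

# Read each file, then stack the frames row-wise into one data-frame.
frames = [pd.read_csv(io.StringIO(text)) for text in (csv_part_1, csv_part_2)]
df = pd.concat(frames, ignore_index=True)
print(df.shape)
```

With real files, each `io.StringIO(text)` would be replaced by the file path.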

Observations

score and score_max have the most NaN values, and both have the same number of them.
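One way to arrive at this kind of observation is to count missing values per column. The sketch below uses a small made-up frame, not the project data.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the merged data (values are illustrative).
df = pd.DataFrame({
    "author": ["alice", "bob", None, "dave"],
    "score": [8.0, np.nan, 6.0, np.nan],
    "score_max": [10.0, np.nan, 10.0, np.nan],
})

# Number of missing values per column, largest first.
nan_counts = df.isna().sum().sort_values(ascending=False)
print(nan_counts)
```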

Round off scores to the nearest integers
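A small sketch of the rounding step on made-up values. Note that pandas rounds exact halves to the nearest even integer, which may or may not be the behaviour wanted here.

```python
import numpy as np
import pandas as pd

scores = pd.Series([7.4, 8.6, np.nan, 9.2], name="score")

# Round to the nearest integer; missing values stay missing.
rounded = scores.round()
print(rounded)
```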

The five-point summary is defined as follows:

  1. The minimum.
  2. Q1 (the first quartile, or the 25% mark).
  3. The median (50%).
  4. Q3 (the third quartile, or the 75% mark).
  5. The maximum.
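The five values listed above can be computed with `Series.quantile`. This is a sketch on made-up scores; `DataFrame.describe()` reports the same figures along with the count, mean and standard deviation.

```python
import pandas as pd

scores = pd.Series([2, 4, 6, 8, 10], name="score")

# Minimum, Q1, median, Q3 and maximum in one call.
five_point = scores.quantile([0.0, 0.25, 0.5, 0.75, 1.0])
five_point.index = ["min", "Q1", "median", "Q3", "max"]
print(five_point)
```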

From the above table, it is clear that missing score_max values can be replaced with either the mean or the median, as all values are the same.
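A sketch of that imputation on a toy column, assuming the median is the chosen fill value (the mean would give the same result when all observed values are equal).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"score_max": [10.0, np.nan, 10.0, np.nan, 10.0]})

# Fill missing score_max values with the column median.
fill_value = df["score_max"].median()
df["score_max"] = df["score_max"].fillna(fill_value)
print(df["score_max"].tolist())
```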

Check for duplicate values and remove them if there are any.
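A minimal sketch of the duplicate check on a toy frame, treating a row as a duplicate only when every column matches.

```python
import pandas as pd

df = pd.DataFrame({
    "author": ["alice", "alice", "bob"],
    "product": ["Phone A", "Phone A", "Phone B"],
    "score": [8, 8, 6],
})

# Count fully identical rows, then keep the first copy of each.
n_dupes = int(df.duplicated().sum())
df = df.drop_duplicates().reset_index(drop=True)
print(n_dupes, df.shape)
```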

    # Drop every row whose author name contains non-ASCII characters.
    # Missing authors are NaN (a float), so only string values are checked.
    na = []  # non-ASCII author names that were removed
    for x in ndf['author'].dropna().unique():
        if isinstance(x, str) and not x.isascii():
            ndf.drop(ndf[ndf['author'] == x].index, inplace=True)
            na.append(x)
            print(x)

Keep only 1,000,000 data samples. Use random_state=612.
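A sketch of reproducible sampling with `DataFrame.sample`. The toy frame has only 100 rows, so the sample size is scaled down; in the project it would be 1,000,000.

```python
import pandas as pd

df = pd.DataFrame({"score": range(100)})

# 1_000_000 in the project; capped here so the toy frame is large enough.
n_samples = min(10, len(df))
sample = df.sample(n=n_samples, random_state=612)
print(sample.shape)
```

Fixing `random_state` makes the same rows come back on every run.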

Drop irrelevant features. Keep features like Author, Product, and Score

Identify the most rated features.

The most frequently reviewed cellphone is 'Lenovo Vibe K4 Note (White,16GB)', the most common source is 'Amazon', the most common country is 'US', and the most common language is 'EN'.

Identify the users with most number of reviews.
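Both the most rated products and the most active users can be read off with `value_counts`. The sketch below uses a made-up frame with the column names assumed to be `author` and `product`.

```python
import pandas as pd

df = pd.DataFrame({
    "author": ["alice", "alice", "bob", "alice", "carol"],
    "product": ["Phone A", "Phone B", "Phone A", "Phone C", "Phone A"],
})

# Number of reviews per user and per product, largest first.
top_users = df["author"].value_counts().head(3)
top_products = df["product"].value_counts().head(3)
print(top_users)
print(top_products)
```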

Select the data with products having more than 50 ratings and users who have given more than 50 ratings.
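A sketch of this selection, assuming the columns are named `author`, `product` and `score`. The helper `filter_by_activity` is hypothetical, and it filters in a single pass, so counts computed after filtering may again fall below the threshold.

```python
import pandas as pd

def filter_by_activity(df, min_ratings=50):
    """Keep rows whose product AND author each have more than min_ratings ratings."""
    product_counts = df.groupby("product")["score"].transform("count")
    user_counts = df.groupby("author")["score"].transform("count")
    return df[(product_counts > min_ratings) & (user_counts > min_ratings)]

# Toy data; the threshold is lowered to 1 so the effect is visible.
toy = pd.DataFrame({
    "author": ["alice", "alice", "bob", "bob", "carol"],
    "product": ["Phone A", "Phone B", "Phone A", "Phone B", "Phone C"],
    "score": [8, 7, 6, 9, 5],
})
final = filter_by_activity(toy, min_ratings=1)
print(final.shape)
```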

Report the shape of the final dataset.


Build a popularity based model and recommend top 5 mobile phones.
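One possible popularity-based recommender ranks products by how many ratings they received and breaks ties by mean score. This is a sketch under that assumption; the helper `top_n_popular` and the ranking rule are illustrative choices, not necessarily the ones used in this project.

```python
import pandas as pd

def top_n_popular(df, n=5, min_ratings=1):
    """Rank products by rating count, then by mean score."""
    stats = df.groupby("product")["score"].agg(rating_count="count", mean_score="mean")
    stats = stats[stats["rating_count"] >= min_ratings]
    return stats.sort_values(["rating_count", "mean_score"], ascending=False).head(n)

# Toy ratings; every user would receive the same recommendations.
ratings = pd.DataFrame({
    "product": ["Phone A", "Phone A", "Phone A", "Phone B", "Phone B", "Phone C"],
    "score": [8, 9, 7, 10, 10, 5],
})
top = top_n_popular(ratings, n=2)
print(top)
```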

Build a collaborative filtering model using SVD.

You can use SVD from surprise or build it from scratch. (Note: in case you are building it from scratch, you can limit your data points to 5000 samples if you face memory issues.)
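For the from-scratch route, a minimal sketch with NumPy is shown below. It mean-centres each user's ratings, treats missing entries as zero deviation, and keeps the top `k` singular values. The function `svd_predict` is hypothetical, and this plain linear-algebra SVD is not the same algorithm as surprise's `SVD`, which is a matrix factorisation trained by stochastic gradient descent.

```python
import numpy as np

def svd_predict(ratings, k):
    """Rank-k reconstruction of a dense user-item matrix (NaN = missing).

    Assumes every user has at least one rating.
    """
    user_means = np.nanmean(ratings, axis=1, keepdims=True)
    # Missing entries become zero deviation from the user's mean.
    centered = np.where(np.isnan(ratings), 0.0, ratings - user_means)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    approx = (u[:, :k] * s[:k]) @ vt[:k, :]
    return approx + user_means

# Rows are users, columns are phones (toy values).
ratings = np.array([
    [8.0, 6.0, np.nan],
    [7.0, 5.0, 9.0],
    [2.0, 9.0, 4.0],
])
predicted = svd_predict(ratings, k=2)
print(predicted.round(2))
```

The value predicted for the missing cell can then be used to rank unseen phones for that user.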

Build a collaborative filtering model using KNNWithMeans from surprise.

You can try both user-based and item-based models.
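To illustrate the idea behind a mean-centred nearest-neighbour model, here is a small user-based sketch in NumPy. It is written in the spirit of KNNWithMeans but is not surprise's implementation: the function `knn_with_means_predict` is hypothetical, and similarity is a cosine over mean-centred rows with missing entries treated as zero, which is a simplification. Transposing the matrix gives the item-based variant.

```python
import numpy as np

def knn_with_means_predict(ratings, user, item, k=2):
    """Predict ratings[user, item] from the k most similar users (NaN = missing)."""
    means = np.nanmean(ratings, axis=1)
    centered = np.where(np.isnan(ratings), 0.0, ratings - means[:, None])
    norms = np.linalg.norm(centered, axis=1)
    candidates = []
    for other in range(ratings.shape[0]):
        if other == user or np.isnan(ratings[other, item]):
            continue
        denom = norms[user] * norms[other]
        sim = centered[user] @ centered[other] / denom if denom > 0 else 0.0
        if sim > 0:  # only positively similar neighbours contribute
            candidates.append((sim, ratings[other, item] - means[other]))
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    top = candidates[:k]
    if not top:
        return means[user]  # fall back to the user's own mean
    weighted = sum(sim * dev for sim, dev in top)
    return means[user] + weighted / sum(sim for sim, _ in top)

ratings = np.array([
    [5.0, 3.0, np.nan],
    [4.0, 2.0, 5.0],
    [1.0, 5.0, 2.0],
])
prediction = knn_with_means_predict(ratings, user=0, item=2)
print(round(prediction, 3))
```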

Collaborative filtering model

1. Using SVD

2. Using KNNWithMeans from surprise (item-based)

3. Using KNNWithMeans from surprise (user-based)

1. SVD

2. KNNWithMeans from surprise for an item-based recommendation system

3. KNNWithMeans from surprise for a user-based recommendation system

Evaluate the collaborative model. Print RMSE value.

Predict score (average rating) for test users

Accuracy is also measured with Mean Absolute Error (MAE).
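Both metrics can be written in a few lines. The sketch below uses plain Python on made-up actual and predicted scores; the function names `rmse` and `mae` are illustrative.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: penalises large errors more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mae(actual, predicted):
    """Mean absolute error: the average size of the error."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)

actual = [8, 6, 10]
predicted = [7, 6, 8]
print(rmse(actual, predicted), mae(actual, predicted))
```

Lower values are better for both metrics.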

Report your findings and inferences.

From the RMSE values, we identify the SVD model as the best model, as it has the lowest RMSE among all models.

Try and recommend top 5 products for test users.
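Given predicted scores for the test users, the top products per user can be picked by grouping and sorting. This sketch assumes the predictions are available as `(user, product, estimated_score)` tuples; the helper `top_n_per_user` is hypothetical.

```python
from collections import defaultdict

def top_n_per_user(predictions, n=5):
    """Return each user's n highest-scoring products, best first."""
    per_user = defaultdict(list)
    for user, product, estimate in predictions:
        per_user[user].append((product, estimate))
    return {
        user: sorted(items, key=lambda pair: pair[1], reverse=True)[:n]
        for user, items in per_user.items()
    }

# Made-up predictions for two test users.
predictions = [
    ("alice", "Phone A", 8.2),
    ("alice", "Phone B", 9.1),
    ("alice", "Phone C", 7.0),
    ("bob", "Phone A", 6.5),
]
top = top_n_per_user(predictions, n=2)
print(top)
```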

Check for outliers and impute them as required.
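The outlier rule used is not stated here, so the sketch below assumes the common 1.5 × IQR rule and replaces flagged values with the median. The scores are made up.

```python
import pandas as pd

scores = pd.Series([6.0, 7.0, 7.0, 8.0, 8.0, 8.0, 9.0, 9.0, 10.0, 100.0], name="score")

# Flag values more than 1.5 * IQR outside the quartiles (assumed rule).
q1, q3 = scores.quantile(0.25), scores.quantile(0.75)
iqr = q3 - q1
is_outlier = (scores < q1 - 1.5 * iqr) | (scores > q3 + 1.5 * iqr)

# Impute flagged values with the median of the column.
cleaned = scores.mask(is_outlier, scores.median())
print(int(is_outlier.sum()), cleaned.max())
```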

As there are no outliers, no imputation is needed.

Try cross validation techniques to get better results.
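The splitting logic behind k-fold cross validation can be sketched with NumPy as below. The generator `k_fold_indices` is hypothetical; with surprise, `cross_validate` from `surprise.model_selection` handles the splitting and scoring.

```python
import numpy as np

def k_fold_indices(n_samples, n_splits=5, seed=612):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    folds = np.array_split(shuffled, n_splits)
    for i in range(n_splits):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(n_splits) if j != i])
        yield train_idx, test_idx

splits = list(k_fold_indices(10, n_splits=5))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Averaging the per-fold RMSE gives a more stable estimate than a single train/test split.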

From the cross-validation RMSE values, we again identified the SVD model as the best among all models.

So the SVD model can be chosen as the best model for the recommendation system.

In what business scenario should you use popularity-based recommendation systems?

An increasing number of online companies are utilizing recommendation systems to increase user interaction and enrich shopping potential. One of the core potential benefits of recommendation systems is their ability to continuously calibrate to the preferences of the user.

Popularity-based recommendation systems work on the principle of popularity, recommending whatever is currently trending.

Example

  1. Google News: News filtered by trending and most popular news.
  2. YouTube: Trending videos.
  3. News Websites
  4. Computer Games
  5. Knowledge Bases
  6. Social Media Platforms
  7. Stock Trading Support Systems
  8. Job Portals
  9. Supermarkets

In what business scenario should you use CF-based recommendation systems?

Collaborative filtering is a technique that picks out items a user might like on the basis of reactions by similar users.

It works by searching a large group of people and finding a smaller set of users with tastes similar to a particular user. It looks at the items they like and combines them to create a ranked list of suggestions.

  1. Helps users find items of interest.
  2. Helps item providers deliver their items to the right users.
  3. Identifies the products that are most relevant to users.
  4. Delivers personalized content.
  5. Helps websites improve user engagement.

Example

  1. Amazon
  2. Spotify
  3. YouTube
  4. Google Ads
  5. Supermarkets
  6. Google Play Store
  7. Swiggy, Zomato

Figure: User-item matrix.

What other possible methods can you think of that could further improve the recommendations for different users?

  1. Ditch Your User-Based Collaborative Filtering Model.
  2. A Gold Standard Similarity Computation Technique.
  3. Boost Your Algorithm Using Model Size.
  4. What Drives Your Users, Drives Your Success.